2320044

DATA RETRIEVAL SYSTEM

The invention relates to a data retrieval system, in particular a
system for retrieving hierarchically related objects from a relational
database.

This invention is described in relation to an LDAP (Lightweight
Directory Access Protocol) directory server using a relational database
as a backing store. This is a particularly appropriate example of
hierarchical information which is nonetheless stored in a relational
database. It will be seen from the following description, however, that
the invention is not restricted to LDAP and is of use wherever
hierarchically related objects (data) have to be stored and retrieved
from a relational database.

An LDAP directory of the known type comprises a collection of
hierarchically related objects, as illustrated in Figure 1.

The structure of the directory and content of its objects are typically
determined by the contents of a schema object which is normally itself
stored in the directory. The contents of this schema object comprise a
set of object class definitions and a set of structural rules, as shown
for the above example in Figure 2. The class definitions include a) a
list of both mandatory (M) and optional (O) attributes for each object
class allowed in the directory, and b) a list defining the hierarchical
relationships between object classes and hence the inheritance rules for
attribute definitions. In the above example, all object classes other than
top are subclasses of the class top, thus inheriting the attribute
objectClass.

The structural rules control the arrangement of objects in the
directory hierarchy and comprise a list of the allowed child object
classes for each parent class and, for each such child class, the naming
rule. The naming rule determines the relative distinguished name (RDN),
which provides a unique name for an object at that point in the
directory hierarchy. Its format is thus somewhat unpredictable for any
object, as it is formed by a concatenation of one or more of the
object's attributes and, as can be seen, more than one naming rule may
apply to an object at any one point in the tree. LDAP objects also have a
unique name in the directory - the distinguished name (DN). The DN is
formed by the successive, sequential concatenation of the RDNs of the
object itself and its parents, back up to the root of the directory tree.

Thus, even in the simple case of Figure 1, the RDN for John Doe may
be init=JDD+ID=005047, and the DN may be orgName=IBM, siteName=Arizona,
init=JDD+ID=005047; while the RDN for Jane Deer may in fact be
emplName=Jane Deer and the DN may be orgName=IBM, siteName=Arizona,
emplName=Jane Deer.
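The way a DN is assembled from RDNs can be illustrated with a minimal sketch. The helper names build_rdn and build_dn are hypothetical and are not part of LDAP or of this disclosure. The root-first ordering follows the examples above; standard LDAP string representations usually write the DN leaf first.

```python
def build_rdn(pairs):
    """Join one or more attribute=value pairs into an RDN using '+'."""
    return "+".join(f"{attr}={value}" for attr, value in pairs)


def build_dn(rdns_root_first):
    """Concatenate RDNs, root first, as in the examples of the description."""
    return ", ".join(rdns_root_first)


# John Doe is named by two attributes, Jane Deer by a single one.
john_rdn = build_rdn([("init", "JDD"), ("ID", "005047")])
jane_rdn = build_rdn([("emplName", "Jane Deer")])

john_dn = build_dn(["orgName=IBM", "siteName=Arizona", john_rdn])
jane_dn = build_dn(["orgName=IBM", "siteName=Arizona", jane_rdn])
```

The two sibling objects end up with RDNs of different shape, which is why the DN makes a poor fixed-format key.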



The nature of the LDAP data model means that hierarchies may be
varied and complex; similarly the naming scheme for objects as
exemplified above also permits substantial variability. As a result, the
schema definition itself for a directory does not provide a mechanism
that can be easily adapted for storage and access of the directory
contents. Consequently a data retrieval system is needed whereby objects
can be assigned to a store with indexing to permit subsequent efficient
search and retrieval.



Accordingly, the present invention provides a data retrieval system
as claimed in claim 1.

Embodiments of the invention will now be described with reference 
to the accompanying drawings, in which: 

Figure 1 is a diagram illustrating a conventional LDAP directory; 

Figure 2 is a conventional schema for the LDAP directory of Figure 1;

Figure 3 illustrates some hierarchical data;

Figures 4(a) & 4(b) are conventional index and data tables for the
hierarchical data of Figure 3;

Figures 5(a) & 5(b) are index and data tables for use in a data
retrieval system according to a first embodiment of the invention; and

Figures 6(a) & 6(b) are index and data tables for use in a data
retrieval system according to a second embodiment of the invention.
The most frequently used search and retrieval operations in the
LDAP protocol, and for most hierarchical databases, are:

1. Retrieval of the object from the specification of its DN. Using the
data of Figure 1, for example, retrieve the object orgName=Microsoft,
prodName=Windows3.1.

2. Search of the directory tree, from a parent object specified by DN
where a value for an attribute is specified, so as to retrieve a subset
of its immediate children. For example, again using Figure 1, search from
orgName=IBM, siteName=Arizona for emplName=John.

3. As above, but to retrieve an object and any descendant of the
object rather than be restricted to immediate children. For example,
retrieve orgName=IBM and all its descendant objects. This may be combined
with further search criteria, for example, emplName=John Doe.

The precise details of the mapping of directory objects into tables
do not form part of this disclosure. Although in practice the objects may be
mapped into several tables, it is simpler here to assume that all the
objects are assigned to a single table. It is necessary, however, that
each object is assigned a unique identifier. Although it is permissible
to use the DN for this purpose, the DN is generally long and of varying
length and therefore somewhat unsuitable. Thus, in the present
embodiments, a unique numeric identifier is assigned to each object.

Taking the data of Figure 3 for example, a conventional storage
scheme which might normally be adopted is shown in Figures 4(a) and 4(b).
Two tables are involved: an index table, Figure 4(a), which maps between
each object DN (A.B.K...Y) and unique identifier (1...7), and a data
table which holds a row for each object. The data table also holds the
unique identifier of the object's parent. The data table has columns for
the various object attributes, including one for the unique object
identifier. In summary:

Index table (row for each object):
    Object name (DN)
    Object identifier

Data table (row for each object):
    Object attributes (one column for each)
    Object identifier
    Parent object identifier
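The conventional scheme can be sketched as follows, using Python's sqlite3 module purely for illustration. The table names, column names and the five-object tree are assumptions made for the sketch and are not the data of Figure 3. The descendants function shows why a full descendant search under this scheme needs one select per visited object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE index_table (name TEXT PRIMARY KEY, id INTEGER UNIQUE);
CREATE TABLE data_table  (id INTEGER PRIMARY KEY, parent_id INTEGER, attr TEXT);
""")

# Hypothetical tree: A(1) has children B(2) and M(3); M has children X(4), Y(5).
rows = [("A", 1, None), ("A.B", 2, 1), ("A.M", 3, 1),
        ("A.M.X", 4, 3), ("A.M.Y", 5, 3)]
for name, oid, parent in rows:
    conn.execute("INSERT INTO index_table VALUES (?, ?)", (name, oid))
    conn.execute("INSERT INTO data_table VALUES (?, ?, ?)", (oid, parent, name))


def descendants(conn, root_id):
    """Navigate the hierarchy breadth first, issuing one SELECT per object."""
    found, frontier, selects = [], [root_id], 0
    while frontier:
        parent = frontier.pop(0)
        children = [r[0] for r in conn.execute(
            "SELECT id FROM data_table WHERE parent_id = ? ORDER BY id",
            (parent,))]
        selects += 1
        found.extend(children)
        frontier.extend(children)
    return found, selects


ids, selects = descendants(conn, 1)
```

In this sketch the number of selects grows with the number of objects visited, not with the number of search requests.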

Variations on this general theme may be used, e.g. the parent
identifier may be stored in the index table rather than in the data
table. However, all similar schemes have the characteristic that they
support direct retrieval of an object whose name is known, but do not
provide for efficient search and/or retrieval of descendants as explained
in point 3 above. For example, if the objects for the database of Figure
1 are not stored in order in tables of the type shown in Figure 4, then a
search for the descendants of IBM could only retrieve descendants of the
Arizona object stored sequentially after the location of the Arizona
object. The search engine would then need to traverse the table again to
find all children of the Arizona object stored before the Arizona object
in the table, and so on for these children.

A retrieval system operating on such an index and data table must
effectively navigate the hierarchy, thereby resulting in many sequential
operations and traverses of the data table. If the tables are implemented
in a relational database, this prohibits the use of the relational
operators to conduct a full descendants search in a single traverse of
the table.

The data retrieval system according to the invention overcomes 
these limitations. 

For each object, its position in the hierarchy can be described by
the sequential concatenation of its own identifier, which is unique in
the directory, with those of its parents, taken in sequence. This
collection of values will be termed the position key. Note that in the
LDAP case, the position key would generally not be the same as the DN,
which is formed by the concatenation of identifiers that are unique only
at their own branch of the tree.
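As a minimal sketch, a position key can be derived from parent pointers as shown below. The function name position_key, the parent_of mapping and the fixed depth are assumptions for illustration. The key is ordered root first and padded with None so that it lines up with the pk1, pk2, pk3 columns used in the embodiments.

```python
def position_key(obj_id, parent_of, depth):
    """Return the identifiers from the root down to obj_id, padded with None."""
    chain = []
    current = obj_id
    while current is not None:
        chain.append(current)
        current = parent_of.get(current)  # None once the root is passed
    chain.reverse()  # root first
    return tuple(chain + [None] * (depth - len(chain)))


# Hypothetical tree: 1 is the root; 2 and 3 are its children; 4 and 5 are
# children of 3.
parent_of = {1: None, 2: 1, 3: 1, 4: 3, 5: 3}

key_root = position_key(1, parent_of, 3)
key_mid = position_key(3, parent_of, 3)
key_leaf = position_key(4, parent_of, 3)
```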

In a first embodiment of the invention, Figures 5(a) and 5(b), the
position key (pk1, pk2, pk3) for an object is preferably stored as a set
of values in a suitable number of columns assigned in both the index
and data tables (one column being required for each level in the
directory tree hierarchy). The advantage of the position key is that the
object data is now very easily searched on a hierarchical basis; thus if
any descendants of a particular object are required, the position key of
the parent can be used in a search condition on an object table (see the
example below). Moreover the key can also be used to control the level of
searching: if only the immediate descendants are required then the search
above can be further restricted by requiring that all other columns
assigned to the position key, save that for the level of the immediate
descendant, be null. Variations on this allow control of the depth of the
search.

Using the tables of Figures 5(a) and 5(b):

1. A is found by a select on the data table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

3. All descendants of A are found by a select on the data table where:
pk1=1 & pk2!=null.

4. The children of M are found by a select on the data table where:
pk1=1 & pk2=3 & pk3!=null.

where != designates the boolean operator "not equal".
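The search conditions above can be tried out with a small sketch using Python's sqlite3 module. The table layout and the five-object tree are assumptions for illustration, with A given identifier 1 and M identifier 3. Note that SQL spells the null tests as IS NULL and IS NOT NULL, since a comparison such as pk2 != NULL never evaluates to true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE data_table (
    identifier INTEGER PRIMARY KEY, name TEXT,
    pk1 INTEGER, pk2 INTEGER, pk3 INTEGER)""")

# Hypothetical rows: the position key lists identifiers from the root down.
conn.executemany("INSERT INTO data_table VALUES (?, ?, ?, ?, ?)", [
    (1, "A", 1, None, None),
    (2, "B", 1, 2, None),
    (3, "M", 1, 3, None),
    (4, "X", 1, 3, 4),
    (5, "Y", 1, 3, 5),
])


def select_ids(where):
    """Run one select over the data table and return the matching identifiers."""
    sql = f"SELECT identifier FROM data_table WHERE {where} ORDER BY identifier"
    return [row[0] for row in conn.execute(sql)]


a = select_ids("identifier = 1")
children_of_a = select_ids("pk1 = 1 AND pk2 IS NOT NULL AND pk3 IS NULL")
descendants_of_a = select_ids("pk1 = 1 AND pk2 IS NOT NULL")
children_of_m = select_ids("pk1 = 1 AND pk2 = 3 AND pk3 IS NOT NULL")
```

Each result is obtained with a single select, in contrast to the repeated navigation needed when only a parent identifier is stored.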

A second embodiment, Figures 6(a) & 6(b), overcomes the problem of
the first embodiment taking significant storage in the index table. It
will be seen that it is not necessary to use all the columns of the position
key to achieve the desired effect. Only the last non-null column, that
is the column furthest from the root, is significant in identifying the
object; the higher level columns contain redundant data. Moreover the
value in this column is the unique object identifier itself.
Accordingly, in the index table, Figure 6(a), instead of adding the
position key columns described previously, it is sufficient to add
a single column, entitled Level, containing the level of that object in
the directory tree. The level indicates which column of the position key
holds the identifier of the object. In the data table, Figure 6(b),
given the information content of the position key, the separate
unique object and parent identifiers are no longer needed.

The tables are now: 
Index table (row for each object):
    Object name
    Object identifier
    Object level in directory tree

Data table (row for each object):
    Object attributes (one column for each)
    Position key identifier (one column for each level in the
    hierarchy)

Using the tables of Figures 6(a) & 6(b):

1. A is found by a select on the index table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

3. All descendants of A are found by a select on the data table where:
pk1=1 & pk2!=null.

4. The children of M are found by a select on the data table where:
pk2=3 & pk3!=null.
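A sketch of how such selects might be generated for any object follows, again using Python's sqlite3 module. The function name find, the table and column names, the three-level depth and the sample tree are all assumptions for illustration. The level stored in the index table picks the position key column to match, and the identifier of each result is read back as the last non-null position key column.

```python
import sqlite3

DEPTH = 3  # assumed number of hierarchy levels, hence columns pk1..pk3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE index_table (name TEXT, identifier INTEGER PRIMARY KEY, level INTEGER);
CREATE TABLE data_table  (attr TEXT, pk1 INTEGER, pk2 INTEGER, pk3 INTEGER);
""")

# Hypothetical tree: A(1) -> B(2), M(3); M -> X(4), Y(5).
objects = [("A", (1, None, None)), ("B", (1, 2, None)), ("M", (1, 3, None)),
           ("X", (1, 3, 4)), ("Y", (1, 3, 5))]
for name, pk in objects:
    level = sum(v is not None for v in pk)  # number of non-null components
    conn.execute("INSERT INTO index_table VALUES (?, ?, ?)",
                 (name, pk[level - 1], level))
    conn.execute("INSERT INTO data_table VALUES (?, ?, ?, ?)", (name, *pk))

# The last non-null position key column is the object's own identifier.
ID_EXPR = "COALESCE(" + ", ".join(f"pk{i}" for i in range(DEPTH, 0, -1)) + ")"


def find(conn, obj_id, immediate_only):
    """Return children (or all descendants) of obj_id with one data-table select."""
    (level,) = conn.execute(
        "SELECT level FROM index_table WHERE identifier = ?",
        (obj_id,)).fetchone()
    if level >= DEPTH:
        return []  # deepest level: there is no column below it
    where = [f"pk{level} = ?", f"pk{level + 1} IS NOT NULL"]
    if immediate_only and level + 2 <= DEPTH:
        where.append(f"pk{level + 2} IS NULL")
    sql = (f"SELECT {ID_EXPR} FROM data_table WHERE "
           + " AND ".join(where) + " ORDER BY 1")
    return [row[0] for row in conn.execute(sql, (obj_id,))]


children_of_a = find(conn, 1, immediate_only=True)
descendants_of_a = find(conn, 1, immediate_only=False)
children_of_m = find(conn, 3, immediate_only=True)
```

The sketch does one lookup in the index table followed by one select on the data table, whatever the size of the subtree.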

Thus, in comparison with the conventional tables of Figure 4, the
index table of the second embodiment contains one extra column, while the
data table of the second embodiment contains a number of extra columns
equal to two less than the number of levels in the hierarchy. However,
search operations are now reduced to a simple traverse of the data table,
instead of the complex sequence of selects previously required.
CLAIMS

1. A data retrieval system in which a plurality of objects having a
multi-level hierarchical relationship are stored, each object having a
respective parent and a set of children, said system including an index
table comprising a respective name and associated identifier for each
object, and a data table comprising a respective set of attributes and a
position key associated with each object in the system, each position key
comprising a series of components, each component corresponding to a
level of the hierarchy, a first component of said key storing the
identifier of an associated object, and each successive component storing
the identifier of the parent of the object stored in the previous
component.



2. A data retrieval system as claimed in claim 1 in which said index
table includes an attribute storing the respective level of each object
in the system.

3. A data retrieval system as claimed in claim 1 in which said index 
table comprises a position key associated with each object in the system, 
and said data table includes a respective identifier associated with each 
object in the system and a respective identifier of the parent of each 
object in the system. 

4. A method of retrieving an object name for an object stored in the
data retrieval system claimed in claim 1 or 2, comprising the steps of:

a. specifying an identifier of the object; and

b. retrieving from the index table the name for the object whose
identifier matches said specified identifier.



5. A method of retrieving the immediate children of an object stored
in the data retrieval system claimed in claim 2, comprising the steps of:

a. specifying an identifier of the object;

b. searching the index table to ascertain the hierarchical level of
said object;

c. selecting from the data table objects where the contents of the
position key component for said level matches the identifier of the
object;

d. for said objects where the contents of the position key
component for the hierarchical level below said level are not null
and where the contents of the position key component for the next
hierarchical level below said level are null, retrieving the
respective object identifiers.

6. A method of retrieving all descendants of an object stored in a
data retrieval system claimed in claim 2, comprising the steps of:

a. specifying an identifier of the object;

b. searching the index table to ascertain the hierarchical level of
said object;

c. selecting from the data table objects where the contents of the
position key component for said level matches the identity of the
object;

d. for said objects where the contents of the position key
component for the level below said level are not null, retrieving
the respective object identifiers.



7. A method of retrieving an object name for an object stored in the
data retrieval system claimed in claim 3, comprising the steps of:

a. specifying an identifier of the object; and

b. retrieving from the data table the name for the object whose
identifier matches said identifier.

8. A method of retrieving the immediate children of an object stored
in the data retrieval system claimed in claim 3, comprising the steps of:

a. specifying an identifier of the object; and

b. for objects where the contents of the parent identifier match
the identifier of the object, retrieving the respective object
identifiers.

9. A method of retrieving all descendants of an object stored in a
data retrieval system claimed in claim 3, comprising the steps of:

a. specifying an identifier of the object;

b. ascertaining the level of the object according to when a first
component of said position key matches the identifier of said
object;

c. selecting from the data table objects where the contents of the
position key component for said level matches the identity of the
object; and

d. for said objects where the contents of the position key
component for the level below said level are not null, retrieving
the respective object identifiers.

10. A method as claimed in claims 5, 6 or 9 where steps c) and d) are
carried out in a single traverse of said data table.

11. A method as claimed in claims 5, 6, 8, 9 or 10 where the retrieved
object identifiers are stored in the first component of respective
position keys.